Method and system for a user-following interface

ABSTRACT

Techniques are disclosed for projecting an image onto a surface suitable for interaction with a user while avoiding user occlusion, and while correcting for distortion due to oblique projection. The displayed image moves to a suitable surface at a suitable size and orientation as a user moves around an environment, resulting in a user-following interface. Surfaces are selected on which the projected interface is not occluded by the user or other objects in the environment. Displayed images may be interactive, and moved into an interaction area on a suitable surface that is convenient for the user. The interaction area may or may not coincide with the display area. Adaptation of the projected interface is allowed so that the content of the display and the style of interaction widgets are modified based on distance from the user and orientation of the user with respect to a projected interface.

FIELD OF THE INVENTION

[0001] The present invention relates to video projection and sensing systems and, more particularly, relates to projecting and sensing techniques that automatically move and adapt a projected interface based on a location of a user in an environment.

BACKGROUND OF THE INVENTION

[0002] Today, people are very dependent on display devices, which serve as sources of information and entertainment during daily activities. In many offices, people spend hours sitting in front of a monitor. At home, people sit in front of a television. People are forced to go to where the display device is in order to interact with the display device. Additionally, “interaction” with a display device is very rudimentary, even today. For instance, some display devices offer interaction through a remote control, and others offer tactile interaction. However, there are very few display devices that offer direct interaction with an image produced by the display device.

[0003] It would be beneficial to provide more interaction between people and display devices. There are systems involving multiple projectors for realizing large-scale displays. Such systems are discussed in Welch et al., “Projected Imagery in Your Office of the Future,” IEEE Computer Graphics and Apps, 62-67 (2000) and Sukthankar et al., “Smarter Presentations: Exploiting Homography in Camera-Projector Systems,” Proc. of Int'l Conf. on Computer Vision, Vancouver, Canada (2001), the disclosures of which are hereby incorporated by reference. However, the projected images cannot be moved by these systems. Additionally, Shafer et al., “The Easy Living Intelligent Environment System,” Proc. of the CHI Workshop on Research Directions in Situated Computing (2000), the disclosure of which is hereby incorporated by reference, discusses an environment in which cameras are used to track a person and a device close to a person is activated. However, this also does not allow interaction between a person and a display device.

[0004] Thus, what are needed are techniques that provide more human interaction with display devices.

SUMMARY OF THE INVENTION

[0005] The present invention solves the problems of the prior art by, in general, tracking users in an environment, and then providing information to a projector in order to create a projected interface on a suitable surface at a suitable position and, optionally, orientation. The suitability of the surfaces, positions, and orientations of the interface is determined by the position and, optionally, the orientation of a user in the environment. The projected interface comprises one or more displayed images, and generally comprises a display area and an interaction area. Thus, aspects of the present invention allow a user to interact with a display device.

[0006] Additionally, in aspects of the present invention, techniques are provided for automatically discovering areas for display and interaction and avoiding occlusion of the interface by, illustratively, performing geometric reasoning based on one or more models of projection, user position, and surfaces in the environment.

[0007] Moreover, aspects of the present invention allow adaptation of the content of the projected interface and the style and placement of the interaction “widgets” based on user position and orientation. An interaction widget is any portion of an image suitable for human interaction. Importantly, aspects of the present invention modify the projected interface so that it is convenient for interaction by being visible to and reachable by the user. The invention can also, when necessary, separate the surfaces selected for display and interaction.

[0008] Thus, the present invention allows a projected interface to automatically appear close to the user on ordinary surfaces in an environment.

[0009] A more complete understanding of the present invention, as well as further features and advantages of the present invention, will be obtained by reference to the following detailed description and drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

[0010] FIGS. 1A through 1E are illustrations of a user-following interface interacting with a user in an environment, in accordance with one embodiment of the present invention;

[0011] FIG. 2 is a block diagram of a user-following interface system and associated devices, in accordance with one embodiment of the present invention;

[0012] FIG. 3 is a flowchart of a method for tracking users and providing users with an interface, in accordance with one embodiment of the present invention;

[0013] FIG. 4 is an illustration of the geometric reasoning for determining the display area on a selected surface, in accordance with one embodiment of the present invention;

[0014] FIG. 5 is an illustration of the geometric reasoning for selecting a non-occluded display area, in accordance with one embodiment of the present invention;

[0015] FIG. 6 is a flow chart of a method for analyzing camera images in one embodiment of a user tracking module, in accordance with one embodiment of the present invention;

[0016] FIG. 7 is a flow chart of a method for determining user interaction with a displayed image, in accordance with one embodiment of the present invention;

[0017] FIG. 8 is a flow chart of a method for modifying a displayed image, in accordance with one embodiment of the present invention;

[0018] FIG. 9 is an illustration of a three-dimensional model of an environment with display areas overlaid, in accordance with one embodiment of the present invention;

[0019] FIG. 10 is an illustration of tracking of an individual in a camera image, in accordance with one embodiment of the present invention; and

[0020] FIG. 11 is an illustration of the mapping of defined display and interaction areas from application space to real space, in accordance with one embodiment of the present invention.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

[0021] The present invention allows a user in an environment to interact with and be followed by displayed images. The images generally comprise an image on a display area and an image on an interaction area, where the display and interaction area may be separate areas, the same area, or overlapping areas. As a user moves around the environment, the present invention moves the displayed images to suitable surfaces. Generally, the suitable surfaces are chosen by proximity to the user and by orientation of the user. The images to be displayed are beneficially distorted so that the displayed images will be relatively undistorted from the perspective of the user. The user interacts with an image, and its interaction widgets, displayed on the interaction area. It should be noted that an interaction area may not have an image displayed on it. For example, a user could interact with a displayed image by moving his or her hand over a table. In this example, a display area contains a displayed image, but the interaction area does not. A steerable camera or other suitable device is used to view the movements by the user, but there is no displayed image on the interaction area. The present invention also may detect when a user occludes a displayed image, e.g., a display area or interaction area or both, and the present invention will move the displayed image to a different surface in order to prevent further occlusion.

[0022] Before proceeding with additional description of the present invention, it is beneficial to analyze conventional display techniques and their problems. As described above, people tend to have to go to a display device in order to interact with the display device, and although it would be beneficial for a displayed image to follow users as they move and function in an environment, conventional techniques do not allow this. For instance, U.S. Pat. No. 6,431,711, entitled “Multiple-surface display projector with interactive input capability,” issued to Claudio Pinhanez (2002) (hereinafter, “Pinhanez”), the disclosure of which is hereby incorporated by reference, teaches methods and apparatus for projecting an image onto any surface in a room and distorting the image before projection so that a projected version of the image will not be distorted. Pinhanez also teaches methods for allowing a person to interact with a projected image.

[0023] However, the multiple-surface display projector of Pinhanez does not have knowledge of the location of a user in an environment, nor of the three-dimensional geometry of the environment. Such a multiple-surface display projector has to be pre-calibrated to project on specific areas on specific surfaces in the environment and must be directed manually or programmatically to switch the displayed image from one pre-calibrated surface to another. The projector of Pinhanez cannot automatically move the displayed image to the appropriate surface at the appropriate position, size, and orientation based on the position and orientation of the user. The system cannot discover or define new areas used for displaying images and can only use the pre-calibrated areas.

[0024] Another problem with the multiple-surface display projector of Pinhanez, and with projection systems in general, is that they cannot detect when the user is occluding an image. These systems cannot change the position of the displayed image or adapt the content of the displayed image to compensate for user occlusion of the displayed image.

[0025] While the multiple-surface display projector of Pinhanez allows for user interaction with the projected image, Pinhanez does not provide ways for adapting the interaction interface based on the location of the user. Pinhanez also does not teach ways to specify an interface in application space and move the interface to arbitrary surfaces.

[0026] Thus, a user of a multiple-surface display projector system often faces several problems, such as the displayed image being in an inappropriate location at an inappropriate size and orientation, or the interaction widgets being too small or too large or unreachable, or the displayed image and/or the interaction widgets being occluded by the user or other objects in the environment.

[0027] What is provided by the present invention are techniques that solve these problems and that (i) automatically move a displayed image to an appropriate surface at an appropriate size and orientation based on the position and, optionally, orientation of a user; (ii) automatically detect and avoid occlusions; and (iii) automatically adapt the location of the interaction area and the content of the interaction interface based on the user location and available surfaces.

[0028] FIG. 1A is an illustration of a user-following interface, in accordance with the present invention, interacting with a user in an environment. The figure shows an environment 110. The environment 110 comprises many surfaces and objects, in particular, wall cabinets 120, 130, 140, and the wall 150. The environment also comprises a user-following interface system 200. Furthermore, there are also cameras and projectors (not shown for simplicity) that are generally used with the user-following interface system 200. The environment 110 shows a path 170 taken by a user 190 as he moves in the environment 110. The path 170 particularly points out four positions, marked as 1, 2, 3, and 4, which the user traverses as he travels along the path 170. The user-following interface system 200 will move an image onto particular surfaces as the user traverses path 170.

[0029] In particular, when the user is at position 1, as shown in FIG. 1B by portion 111 of environment 110, the user-following interface system 200 moves a displayed image 180 to the surface of wall cabinet 120. The position of the user is important to where the user-following interface system 200 moves the displayed image 180. Additionally, the orientation 195 of the user may be used to determine to which surface the image should be moved. This is described in more detail below. When the user moves to position 2, as shown in FIG. 1C by portion 112 of environment 110, the user-following interface system 200 moves the displayed image 180 to the cabinet 130, and moves the displayed image 180 to the cabinet 140 when the user moves to position 3, as shown in FIG. 1D by portion 113 of environment 110. When the user moves to position 4, as shown in FIG. 1E by portion 114 of environment 110, the user-following interface system 200 moves the displayed image 180 to the wall 150. Note that the user-following interface system 200 moves the image to an appropriate surface that is available and close to the user, that the size and orientation of the displayed image 180 can change, and that the displayed image 180 is positioned to avoid occlusion by the user or other objects in the environment. An occlusion is a blockage of the displayed image 180, such that a portion or all of the displayed image 180 is shaded by an object or user that intervenes between the projector and the displayed image 180.

[0030] FIG. 2 is a block diagram of an exemplary user-following interface system 200, shown interacting with a computer system 299, a steerable projector 260, multiple tracking devices 205-1 through 205-n (collectively, “tracking devices 205”), and a steerable camera 280, in accordance with one embodiment of the present invention. User-following interface system 200 comprises a user tracking module 210, a user-following interface controller 225, an environment model 230, a graphics controller 265, an interface adapter 270, and an interaction detector 275. Computer system 299 comprises application 294, which is coupled to the user-following interface system 200.

[0031] In this embodiment, multiple tracking devices 205 are used as means for determining user location and orientation in an environment. The tracking devices 205 can be cameras, electromagnetic devices, or active badges. For instance, Radio Frequency Identification (RFID) devices could be used as an electromagnetic device. There are a variety of active badges that may be worn by users and will transmit the location and sometimes orientation of the users. Some examples of such badges and systems are as follows. In an electromagnetic sensing system, a transmitting antenna in a fixed location generates axial direct current magnetic-field pulses. Receiving antennas are worn by the users. The system computes position and orientation of the receiving antennas by measuring the response in three orthogonal axes to the transmitted pulse, combined with the constant effect of the magnetic field of the earth. In a combined Radio Frequency (RF)/ultrasound system, users carry RF/ultrasound emitters, each of which sends a unique RF/ultrasound pulse on receiving a radio-frequency trigger signal. Receivers mounted on the ceiling also receive the trigger signal and the pulses from the emitters. The location of an emitter is estimated by the delay between arrival of the trigger signal and the emitted pulse at multiple receivers.

[0032] In the example of FIG. 2, cameras will be assumed to be used for tracking devices 205, but any device suitable for tracking a human may be used as a tracking device 205. The cameras 205 are input to a user tracking module 210 that analyzes images from cameras 205 to detect the presence of people in the environment observed by the cameras and that determines the position and, optionally, orientation of the detected people (called “users”). The user position and orientation parameters 215 are read by the user-following interface controller 225. The user-following interface controller 225 also has access to an environment model 230, which has a representation of the three-dimensional surfaces and objects in the environment (such as environment 110).

[0033] The environment model 230 is beneficially built beforehand based on measurements made in the environment manually or through automated or semi-automated means. The user-following interface controller 225 queries the environment model 230 to obtain specific surfaces and objects in the environment model 230. The user-following interface controller 225 also provides updates on user location, as well as defined display areas, to the environment model 230. FIG. 9, described below, shows an example of a three-dimensional model of an environment with several defined display areas.

[0034] The user-following interface controller 225 also receives calibration parameters 220 for the various cameras 205 and 280 and for the steerable projector 260. These calibration parameters 220 give the three-dimensional position and orientation, as well as the zoom and focus parameters, of the cameras and projectors. The user-following interface controller 225 performs geometric reasoning based on the user position and orientation parameters 215, the calibration parameters 220, and the three-dimensional surface and object parameters 232. Based on this geometric reasoning, the user-following interface controller 225 determines where the interface should be displayed so that it is clearly visible to the user, is not occluded by the user or other objects in the environment, and is convenient for user interaction. It should be noted that there could be multiple users in the environment, but only one user might be using the interface.

[0035] The user-following interface system 200 of FIG. 2 employs a steerable projector 260 as a way to move the displayed image to the appropriate location based on the user position. In an embodiment, the steerable projector 260 consists of a combination of a Liquid Crystal Display (LCD) projector and a steerable mirror, as described in Pinhanez, already incorporated by reference above. However, any projector or projector system that can project onto multiple surfaces in an environment may be used. The projector-mirror combination of Pinhanez can project images onto different surfaces in an environment, thus creating a displayed image on any ordinary surface or object. However, when an image is projected onto a surface that is not orthogonal to both the axis of projection of the projector and the viewing direction of the user, the projected image appears distorted to the user. In order to eliminate this distortion, a graphics controller 265 pre-warps or pre-distorts the image to be projected such that it appears undistorted to the user. This is also described in Pinhanez.

[0036] Since a benefit of the user-following interface system 200 of FIG. 2 is to have an interface follow the user, the parameters 235 to the steerable projector 260, as well as the parameters 240 of the graphics controller 265, should be changed dynamically based on user position. The user-following interface controller 225 uses the aforementioned geometric reasoning to select a surface on which an image is to be displayed, and the size and orientation of the displayed image, depending on user location and orientation. The user-following interface controller 225 then determines the orientation, zoom, and focus parameters 235 for the steerable projector 260 to project the image onto the selected surface. The user-following interface controller 225 also determines the warping parameters 240 for the graphics controller 265 so that the image to be displayed is pre-distorted to appear at the appropriate orientation and size to the user.
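
By way of illustration, the geometric core of this computation can be sketched in a few lines of Python. The sketch below assumes a projector whose pan axis is vertical (y up, z forward) and ignores the calibration mapping from angles to device control values; the function and variable names are hypothetical and not part of the disclosed system.

    import numpy as np

    def aim_parameters(projector_pos, target_pos):
        # Direction from the projector's center of projection to the
        # center of the selected display area.
        d = np.asarray(target_pos, float) - np.asarray(projector_pos, float)
        pan = np.arctan2(d[0], d[2])                   # rotation about the vertical axis
        tilt = np.arctan2(d[1], np.hypot(d[0], d[2]))  # elevation above the horizontal
        distance = np.linalg.norm(d)                   # drives the focus setting
        return pan, tilt, distance

In a real system these angles and the distance would then be mapped through the projector's prior calibration to its actual pan, tilt, zoom, and focus control values, as discussed with reference to step 350 below.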

[0037] Besides creating a displayed image on any surface in an environment based on user location, the user-following interface system 200 of FIG. 2 also provides for interaction with the displayed image so that the displayed image acts as an interactive interface to an application. In the present embodiment, the user interacts with projected images by means of hand gestures. These user interactions are detected by means of a steerable camera 280 and an interaction detector module 275. The camera 280 is steered to view the displayed image produced by the steerable projector 260. The images 288 from the camera 280 are analyzed by the interaction detector 275 to detect user interactions. Detected user interactions 292 are passed on to the application 294, to which the user-following interface system acts as the interface.

[0038] As the steerable camera 280 should continually view the displayed image for user interactions, the orientation, zoom, and focus parameters 255 of this camera also should be varied dynamically based on user position. Additionally, the interaction detector 275 should detect user interaction in different interaction areas or widgets in the displayed image. As the position, orientation, and size of the displayed image change dynamically, the appearance of the display area contained in the camera images 288 changes dynamically, as do the position, size, and orientation of the interaction widgets in the camera images 288. To enable detection of interaction by the interaction detector 275 in spite of these variations, the interaction detector is provided with warping parameters that can be used to map a representation of the widgets in a canonical orthogonal view to the view seen in the camera images 288 from the camera. The interaction detector 275 then performs analysis for interaction detection in the camera images 288.

[0039] Hence, appropriate orientation, zoom, and focus parameters 255 should be provided to the steerable camera 280 to keep the current displayed image in view, and warping parameters 250 should be provided to the interaction detector 275 to enable detection of interaction from the current image view. This is similar to providing orientation, zoom, and focus parameters 235 to the steerable projector 260 and warping parameters 240 to the graphics controller 265. The user-following interface controller 225 determines the parameters 255, 250 along with the parameters 235, 240 based on the aforementioned geometric reasoning.

[0040] It should be noted that while the area for interaction by the user is typically the same as the display area, this need not, in general, be the case. The user-following interface system of FIG. 2 can position the display and interaction areas to be fully overlapping, partially overlapping, or non-overlapping. For example, when the user is too far from a displayed image, the system can provide an interaction area on a surface close to the user. User interactions on this surface then translate into selections and actions related to the displayed image. The user-following interface controller 225 determines the location of both display and interaction areas based on user position and accordingly specifies the parameters 235, 240, 250, and 255.

[0041] Finally, the content of the interface may also be varied based on the user position. For example, if the user is on the right of a large displayed image of a video player application, the interaction buttons such as “play” and “rewind” should be on the right-hand side of the displayed image to be accessible by the user. On the other hand, if the user is on the left-hand side, the buttons should also be on the left-hand side. The number of buttons and the size of the text on a button may also have to be changed based on user position. The user-following interface system of FIG. 2 has an interface adapter module 270 that dynamically changes the actual interface content. The interface adapter 270 receives an interface definition 290 from the application 294. The interface definition 290 provides the interaction widgets and their canonical views and locations. These are generally application-specific. For example, these widgets may be “play,” “pause,” “forward,” and “rewind” buttons in a video player application. The interface definition 290 may also include a set of alternate definitions that the interface adapter 270 can choose from. The interface adapter receives parameters 245 that specify the user distance from the displayed image, the orientation of the user with respect to the displayed image, and the size of the user's hand as seen in the camera images 288. These parameters 245 are provided to the interface adapter 270 by the user-following interface controller 225.

[0042] The interface adapter 270 modifies the interface definition based on these parameters. An adapted interface definition 284 is passed on to the graphics controller 265 and an adapted interface definition 286 is passed on to the interaction detector 275. The graphics controller 265 accordingly adapts the content of the projected image (such as the position of buttons and the text on buttons). The adapted interface definition 286 also determines where the interaction detector 275 should look for interactions in the camera images 288 and the specific interactions to look for.

[0043] It is to be appreciated that user-following interface system 200 can be implemented by one or more computers. For instance, the user-following interface system 200 can be implemented by computer system 299. A computer system generally comprises one or more processors and a memory. Additionally, the computer system may be coupled to a network and may receive programming or updates from the network. Instructions necessary for causing the computer system to carry out the steps required to perform the functions of user-following interface system 200 may be part of an article of manufacture, such as a compact disk. Additionally, all or part of user-following interface system 200 can be embodied in hardware, such as through a custom semiconductor chip. Furthermore, although user-following interface system 200 is shown as being one device, portions of the user-following interface system 200 can be separated from the system. For example, the user tracking module 210 could execute on one computer system, while the user-following interface controller 225 could execute on a second computer system, and the position and orientation 215 could be passed through a network connecting the two computer systems.

[0044] Furthermore, it should be appreciated that multiple steerable cameras 280 and steerable projectors 260, and any associated controllers or equipment, may be used in an implementation of the present invention. For instance, if it is desired that an interaction area comprise displayed images separate from displayed images in a display area, then it is beneficial to have two steerable projectors 260. One steerable projector 260 would be used to project the displayed images in the display area, and the second steerable projector 260 would be used to project the displayed images in the interaction area.

[0045] When there are multiple “users” in an environment, where a “user” is defined as a detected person, the user-following interface system 200 can select one or more of the users to be the person or persons being tracked. The selected person will also generally be selected as the person who will be using the interface. Optionally, the selected person could be input by the currently tracked user, so that the currently tracked user can inform the user-following interface system 200 as to which user is to be the new tracked user.

[0046] FIG. 3 shows a flowchart of a method 300 generally executed by, for instance, the user-following interface controller 225 from FIG. 2. Reference to FIG. 2 is beneficial. Method 300 is a method to determine control parameters provided to the steerable cameras and projectors in the system. The user-following interface controller 225 receives the current user position and orientation parameters 215 from the user tracking module 210. Then, in step 310, the user-following interface controller 225 queries the environment model 230 and retrieves all the surfaces in the environment model 230 that are within a range of distance between D_(max) and D_(min) from the user position and within an angle between O_(max) and O_(min) from the user orientation. The parameters D_(max), D_(min), O_(max), and O_(min) are experimentally predetermined for a specific context or provided dynamically by an application.
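
A minimal Python sketch of this surface query is given below, assuming the environment model exposes an in-memory list of surfaces with precomputed center points; the data layout and names are illustrative assumptions, since the patent leaves the environment-model interface unspecified.

    import numpy as np

    def candidate_surfaces(surfaces, user_pos, user_dir, d_min, d_max, o_min, o_max):
        # Keep surfaces whose center lies between d_min and d_max from the
        # user and whose bearing lies between o_min and o_max (radians)
        # from the user's facing direction.
        user_dir = np.asarray(user_dir, float)
        user_dir = user_dir / np.linalg.norm(user_dir)
        kept = []
        for s in surfaces:
            v = np.asarray(s.center, float) - np.asarray(user_pos, float)
            dist = np.linalg.norm(v)
            angle = np.arccos(np.clip(np.dot(v / dist, user_dir), -1.0, 1.0))
            if d_min <= dist <= d_max and o_min <= angle <= o_max:
                kept.append((dist, s))
        kept.sort(key=lambda t: t[0])   # closest first, ready for step 320
        return [s for _, s in kept]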

[0047] In step 320, the user-following interface controller selects the surface closest to the user among the list of surfaces retrieved in step 310. In step 330, the user-following interface controller determines a display zone (also called the “display area”) on the selected surface. Illustratively, this is done by geometric reasoning involving a) estimating the intersection of a rectangular viewing pyramid from the user position with the selected surface, which completely intersects the surface while minimizing the distance from the user to the center of the intersection area; b) finding the maximal coverage of the intersection area by the pyramid of projection from the projector; and c) finding the maximal rectangle within the coverage area that is aligned with the viewing pyramid of the user. This geometric reasoning is illustrated in FIG. 4, described below.

[0048] In step 340, the user-following interface controller 225 checks if the resultant display area is large enough for user viewing. This is done by verifying that the viewing angle for the user for the resulting display rectangle is greater than a minimum. If the display area does not pass the check in step 340, then, in step 370, the current surface is removed from the list of surfaces provided to step 320 and the process is repeated from step 320. If the estimated display area passes the check in step 340, the orientation, pan, tilt, and zoom parameters of the projector are estimated in step 350. This is done based on the geometric reasoning performed in step 330, where the axis of projection and the viewing frustum of the projector are determined. The orientation of the axis, the distance from the center of projection to the selected display area, and the angle of the viewing frustum of the projector are mapped to the control parameters for the projector's pan, tilt, zoom, and focus, based on a prior calibration of the projector.

[0049] Then, in step 360, the user-following interface controller 225 further verifies that the selected display area is not occluded by the user or other objects in the environment. This is done by verifying a) that the estimated viewing frustum of the projector does not intersect the bounding volume around the user; and b) that there is no surface in the environment model, other than the selected surface, that intersects the viewing frustum of the projector.
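
The following Python sketch illustrates part a) of this check. It approximates the projector's viewing frustum by the rays from the projector to the corners of the selected display area, and the user's bounding volume by an axis-aligned box; this simplification is an assumption made for illustration, not the patent's exact test.

    import numpy as np

    def segment_hits_box(p0, p1, box_min, box_max):
        # Slab test: does the segment from p0 to p1 pass through the
        # axis-aligned box [box_min, box_max]?
        p0, p1 = np.asarray(p0, float), np.asarray(p1, float)
        d = p1 - p0
        t_enter, t_exit = 0.0, 1.0
        for axis in range(3):
            if abs(d[axis]) < 1e-12:
                if not (box_min[axis] <= p0[axis] <= box_max[axis]):
                    return False
            else:
                t0 = (box_min[axis] - p0[axis]) / d[axis]
                t1 = (box_max[axis] - p0[axis]) / d[axis]
                t0, t1 = min(t0, t1), max(t0, t1)
                t_enter, t_exit = max(t_enter, t0), min(t_exit, t1)
                if t_enter > t_exit:
                    return False
        return True

    def display_area_occluded(projector_pos, area_corners, box_min, box_max):
        # Flag the area as occluded if any corner ray crosses the
        # user's bounding volume.
        return any(segment_hits_box(projector_pos, c, box_min, box_max)
                   for c in area_corners)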

[0050] If the selected display area does not pass the occlusion check, then, in step 370, the selected surface is removed from the list of surfaces provided to step 320 and the process is repeated from step 320. This is described below in more detail in reference to FIG. 5.

[0051] If the selected display area does pass the occlusion check in step 360, then the user-following interface controller estimates the projection pre-warping parameters in step 380. As explained earlier in the context of FIG. 2, the pre-warping parameters are used by the graphics controller 265 to pre-distort a projected image so that it appears undistorted to the user. In step 380, the user-following interface controller uses geometric reasoning to determine the mapping from the corners of the display area selected in step 330 to the corresponding points on the projector's image plane. This maps the rectangle on the display area to a quadrilateral on the projected image, which is typically not a rectangle. The user-following interface controller then computes the mathematical transformation, or homography, that maps the rectangle on the display area to its corresponding quadrilateral in the projected image plane. This mapping, when applied to an image that is to be projected, pre-distorts the image so that it appears undistorted to the user. The parameters of the computed mathematical transformation are the warping parameters 240 that are passed to the graphics controller. This is described in more detail in reference to FIG. 11, described below.
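
Such a homography can be computed from the four corner correspondences by the standard direct linear transform. The Python sketch below is a generic implementation of that computation, not code from the patent; rect_pts and quad_pts stand for the four display-area corners and their images on the projector's image plane.

    import numpy as np

    def homography_from_corners(rect_pts, quad_pts):
        # Build the 8x9 direct-linear-transform system from the four
        # point correspondences (x, y) -> (u, v) and take its null vector.
        A = []
        for (x, y), (u, v) in zip(rect_pts, quad_pts):
            A.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
            A.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
        _, _, vt = np.linalg.svd(np.asarray(A, float))
        H = vt[-1].reshape(3, 3)
        return H / H[2, 2]

    def warp_point(H, pt):
        # Apply the homography to a 2-D point in homogeneous coordinates.
        p = H @ np.array([pt[0], pt[1], 1.0])
        return p[:2] / p[2]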

[0052] Next, in step 382, the user-following interface controller moves the projected interface to the desired position by sending control parameters 235 to the steerable projector. These are the pan, tilt, zoom, and focus parameters estimated in step 350.

[0053] In step 384, the user-following interface controller determines the interaction area corresponding to the display area selected in step 330. The interaction area is the area in which a user interacts with the projected interface. In the present embodiment, the user interacts with the projected image by means of their hand. As explained earlier in the context of FIG. 2, the interaction area may be identical to the display area, or may be different and fully overlapping, partially overlapping, or not overlapping with the display area, depending on application constraints and the position of the user and the position of the camera used for detecting user interaction. In step 384, geometric reasoning and application-specific constraints are used to determine the interaction area. Typically, this involves determining the maximal viewing rectangle aligned with the selected display area that is covered by the viewing frustum from the camera. This is determined using camera calibration parameters. If the resulting viewing rectangle is not large enough, the rectangle is moved on the surface containing the display area or to a different surface in the environment model that is close to the user and seen by the interaction-detecting camera.

[0054] Next, in step 386, the user-following interface controller determines the image warping parameters corresponding to the interaction area. As explained earlier, the interaction detector 275 is provided with these warping parameters, which can be used to map a representation of interaction widgets in a canonical orthogonal view to the view seen in the images 288 from the camera. The interaction detector 275 then performs analysis for interaction detection in the camera images 288. The computation of the image warping parameters in step 386 is similar to the computation of the projection pre-warping parameters in step 380. In this case, the corners of the interaction area selected in step 384 are mapped to the corresponding points on the image plane of the interaction-detecting camera. This maps the rectangle on the interaction area to a quadrilateral on the camera image plane, which is typically not a rectangle. The user-following interface controller then computes the mathematical transformation, or homography, that maps the rectangle on the interaction area to its corresponding quadrilateral in the camera image plane. The parameters of the computed mathematical transformation are the warping parameters 250 that are passed to the interaction detector. This is described below in more detail in reference to FIG. 11.

[0055] In step 388, the user-following interface controller estimates the camera control parameters needed to move the interaction-detecting steerable camera to view the interaction area selected in step 384. This is based on the geometric reasoning performed in step 384, along with the calibration data for the camera. The orientation of the axis of the camera for the selected interaction area, the distance from the optical center of the camera to the selected interaction area, and the angle of the viewing frustum of the camera are mapped to the control parameters for the camera's pan, tilt, zoom, and focus. These control parameters 255 are passed to the camera.

[0056] Finally, in step 390, the user-following interface controller determines the user parameters 245 relative to the selected display and interaction areas and passes them on to the interface adapter. These parameters typically include the distance of the user, the viewing angle for the user, and the orientation of the user relative to the selected display and interaction areas.

[0057] The geometric reasoning used in step 330 of FIG. 3 is illustrated in FIG. 4. FIG. 4 shows a selected surface 410. The corresponding user position is at 420 and the projector is at 430. The viewing pyramid for the user is indicated by 425 and the pyramid of projection of the projector by 435. The quadrilateral 440 indicates the intersection of the viewing pyramid of the user with the surface 410 that completely intersects the surface while minimizing the distance from the user to the center of the intersection area. The quadrilateral 450 indicates the intersection of the pyramid of projection from the projector with the selected surface 410. Rectangle 460 is the maximal rectangle within the intersection of the quadrilaterals 440 and 450 that is aligned with the viewing pyramid of the user.

[0058] FIG. 5 helps to illustrate steps of method 300, described above. It is helpful to refer to FIG. 3 during the description of FIG. 5. An example of an occlusion check is illustrated in FIG. 5, which shows a model 510 of an environment along with a projector location 520, the bounding volume 530 around the position of the user, and a selected display area 540. In step 360 of FIG. 3, an occlusion check is made. The user-following interface controller 225 determines through geometric reasoning that the viewing pyramid 545 from the projector location 520 to the display area 540 is occluded by the user bounding box 530. If the selected display area does not pass the occlusion check, then, in step 370, described above, the selected surface is removed from the list of surfaces provided to step 320 and the process is repeated from step 320. In FIG. 5, for example, the display area 540 does not pass the occlusion check, and the user-following interface controller selects display area 550, which passes the occlusion check and is close to the user. In particular, the viewing pyramid 555 from the projector location 520 to the display area 550 is not occluded by the user bounding box 530.

[0059] FIG. 6 shows a flowchart of a method 600 implemented, illustratively, by the user tracking module 210 from FIG. 2. Method 600 tracks a user. As indicated in FIG. 2, images from multiple cameras are processed by the user tracking module to yield the position and orientation of the user. FIG. 6 illustrates exemplary steps involved in processing images from one camera. Similar processing occurs on camera images from the remaining cameras. The results from these multiple analyses are merged in step 680, as explained shortly.

[0060] Step 610 computes the difference between two consecutive images in the incoming image sequence 605 from a camera. Alternatively, step 610 can compute the difference image between the current image from the camera and an estimated “background image” of the observed scene. The resulting difference image is then thresholded and filtered in step 620 to yield a set of “foreground” regions produced by new objects moving into the scene. Step 620 produces a list of detected foreground regions that is provided to step 630. Then, in step 630, one of these regions is selected for further analysis. Initially, the selection is driven merely by size constraints. Once tracking commences, the selection is driven by the head detection parameters 665 from the previous image. In this case, the selection in step 630 involves searching for the region that is close in position, size, and shape to the region corresponding to a head detection in the previous image.
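
A minimal sketch of steps 610 and 620 using OpenCV is shown below; the threshold and minimum-region-size values are illustrative assumptions, not values from the patent.

    import cv2

    def foreground_regions(prev_frame, curr_frame, thresh=25, min_area=500):
        # Step 610: difference two consecutive frames.
        g0 = cv2.cvtColor(prev_frame, cv2.COLOR_BGR2GRAY)
        g1 = cv2.cvtColor(curr_frame, cv2.COLOR_BGR2GRAY)
        diff = cv2.absdiff(g0, g1)
        # Step 620: threshold, filter out noise, and extract the
        # surviving foreground regions as contours.
        _, mask = cv2.threshold(diff, thresh, 255, cv2.THRESH_BINARY)
        kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
        mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        return [c for c in contours if cv2.contourArea(c) >= min_area]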

[0061] Step 640 then involves checking if the selected region satisfies a head-shape check. Illustratively, this step involves analyzing the bounding contour of the region, and checking if there exist two prominent concavities with a convexity between them, corresponding to shoulders and a head in between them. Next, it is verified that the bounding contour corresponding to the head region is close in shape to an ellipse. If these checks are not met, step 670 removes the region from the list of foreground regions, and the process is repeated from step 630. If the head-shape check is met in step 640, step 650 checks if there is sufficient flesh-tone color in the selected region. If this check is also met, then step 660 estimates the location, size, shape, and intensity distribution for the detected head region and passes these head detection parameters 665 on to step 680 and to step 630 for the subsequent image. Step 660 also estimates the orientation of the head, or the pan and tilt parameters of the head with respect to the optical axis of the camera.

[0062] Step 680 combines the head detection parameters from multiple cameras observing the scene at the same time. Step 680 first matches the parameters, particularly the shape and color distribution of the detected head regions, from the different cameras. For the matching head regions from the multiple cameras, step 680 performs stereo using the calibration parameters of the different cameras and obtains the best estimate for the three-dimensional (3D) position of the head. Similarly, step 680 combines the estimates of head pan and tilt with respect to individual cameras to estimate the 3D orientation of the head. In step 685, this head position and orientation is matched with the corresponding estimates in the previous image, and the resulting match is then used to update the trajectory of a user in step 690. Step 690 produces the user tracking module outputs on user position, orientation, and trajectory. An example of a tracked user is shown in reference to FIG. 10.
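
The stereo estimate in step 680 can be illustrated by standard linear triangulation from two calibrated views. The sketch below assumes 3x4 projection matrices derived from the camera calibration parameters and is not code from the patent.

    import numpy as np

    def triangulate_head(P1, P2, pt1, pt2):
        # Linear triangulation: each matched image point (x, y) contributes
        # two rows of the form x*P[2] - P[0] and y*P[2] - P[1].
        P1, P2 = np.asarray(P1, float), np.asarray(P2, float)
        A = np.vstack([
            pt1[0] * P1[2] - P1[0],
            pt1[1] * P1[2] - P1[1],
            pt2[0] * P2[2] - P2[0],
            pt2[1] * P2[2] - P2[1],
        ])
        _, _, vt = np.linalg.svd(A)
        X = vt[-1]
        return X[:3] / X[3]   # 3D head position in world coordinates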

[0063] FIG. 7 shows a flowchart of a method 700 for tracking interaction from a user. Method 700 is generally executed by, for example, the interaction detector module 275 from FIG. 2. As indicated in FIG. 2, and in reference to FIG. 7, the inputs to the interaction detector module 275 are camera images 288 from a steerable camera, an interface definition 286, and warping parameters 250. These inputs are provided to the method in step 705. The outputs of the interaction detector are interaction events 292, which are sent to the application. In this embodiment of the interaction detector module 275 and method 700, it is assumed that the user interacts with the interface by means of their finger. However, the user may interact with the interface through any device, such as a laser pointer, or through any body part.

[0064] Step 710 computes the difference between two consecutive images in the incoming camera images 288 from the steerable camera. The resulting difference image is then thresholded and filtered in step 720 to yield a set of “foreground” regions produced by new objects moving into the scene. Then, in step 730, the interaction detector searches the foreground regions for a finger tip. This is done by searching each foreground region for a possible match with a precomputed template of a finger tip. In step 740, the interaction detector checks if a finger tip has been found by verifying if there is an acceptable match between the finger tip template and one of the foreground regions. If a match is found, the trajectory of the finger tip, derived from finger tip detection in previous images, is updated with the detection from the current image. If no current trajectory exists, a new trajectory is created in step 750.
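
Steps 730 and 740 can be sketched with normalized template matching, as below. For brevity the sketch searches the whole frame rather than the individual foreground regions, and the acceptance threshold is an illustrative assumption.

    import cv2

    def find_finger_tip(frame_gray, tip_template, score_thresh=0.8):
        # Step 730: correlate the precomputed finger tip template with the image.
        result = cv2.matchTemplate(frame_gray, tip_template, cv2.TM_CCOEFF_NORMED)
        _, max_val, _, max_loc = cv2.minMaxLoc(result)
        # Step 740: accept the detection only if the match is good enough.
        if max_val < score_thresh:
            return None
        h, w = tip_template.shape[:2]
        return (max_loc[0] + w // 2, max_loc[1] + h // 2)  # tip center in image coordinates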

[0065] Next, in step 770, the current trajectory is analyzed for interaction events. In order to determine interaction events, the interaction detector requires a definition of the interface in image space. The interface definition includes the specification of the position, shape, and size of interaction regions or widgets in an image, and a specification of the event to be detected, for example, a widget selection resulting from a user moving his or her finger tip into the widget region and then stopping or withdrawing. Since the interface definition 286 is in application space, this definition has to be mapped into image space to support the analysis in step 770. This is done in step 760, where the interface definition 286, together with the warping parameters 250 provided by the user-following interface controller, is used to map the interface definition from application space to image space. After the analysis in step 770, step 780 checks if there are any detected interaction events. If so, step 790 sends the interaction events to the application by packaging and transmitting the events in the appropriate format.
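
A Python sketch of steps 760 and 770 follows: a widget rectangle defined in application space is mapped into image space through the warping homography, and the finger tip trajectory is checked for an enter-and-stop selection event. The dwell length and motion tolerance are illustrative assumptions.

    import numpy as np
    import cv2

    def widget_in_image_space(widget_corners, H):
        # Step 760: map the widget's corners from application space to
        # image space using the warping parameters (homography H).
        pts = np.asarray(widget_corners, np.float64).reshape(-1, 1, 2)
        return cv2.perspectiveTransform(pts, np.asarray(H, np.float64)).reshape(-1, 2)

    def selection_event(trajectory, widget_poly, dwell_frames=10, max_motion_px=5.0):
        # Step 770: fire a selection when the tip has been inside the
        # widget for the last dwell_frames frames and has stopped moving.
        poly = np.asarray(widget_poly, np.float32)
        inside = [cv2.pointPolygonTest(poly, (float(x), float(y)), False) >= 0
                  for x, y in trajectory]
        if len(inside) < dwell_frames or not all(inside[-dwell_frames:]):
            return False
        recent = np.asarray(trajectory[-dwell_frames:], float)
        return float(np.ptp(recent, axis=0).max()) < max_motion_px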

[0066] Next, step 795 updates or eliminates the current trajectory as needed. For instance, if no finger tip was found in step 740 and a current finger tip trajectory exists from detections in previous frames, this existing trajectory is eliminated in step 795 so that a fresh trajectory is begun in the next round of processing. As seen in FIG. 7, after step 795 a new round of processing begins with step 710.

[0067] As described earlier, the content of the interface may also be varied based on the user position. For example, if the user is on the right of a large displayed image of a video player application, the interaction buttons such as “play” and “rewind” should be on the right-hand side of the displayed image to be accessible by the user. On the other hand, if the user is on the left-hand side, the buttons should also be on the left-hand side. The number of buttons and the size of the text on a button may also have to be changed based on user position. The user-following interface system of FIG. 2 has an interface adapter module 270 that dynamically changes the actual interface content.

[0068] FIG. 8 shows an exemplary embodiment of a method 800 for modifying content size and widgets. Method 800 is usually performed, for instance, by an interface adapter module 270. The interface adapter module 270 receives an interface definition 290 from an application. The interface definition 290 provides the interaction widgets and their canonical views and locations. These are generally application-specific. The interface definition 290 may also include a set of alternate definitions that the interface adapter module 270 can choose from. The interface adapter module 270 receives user parameters 245 that specify the user distance from the displayed image, the orientation of the user with respect to the displayed image, and the size of the hand of the user as seen in the camera images 288. These parameters 245 are provided to the interface adapter module 270 by the user-following interface controller 225.

[0069] In step 810, the interface adapter module 270 determines the effective resolution of the displayed image with respect to the user. This is done by determining the viewing angle subtended by the display area at the user, using the parameters 245, and then using the interface definition 290 to determine the number of pixels in the displayed image. The adapter determines the number of unit viewing angles per pixel based on the distance of the user from the displayed image, the size of the displayed image, and the number of pixels in the displayed image. If the effective resolution, measured as the number of unit angles available per pixel, is low, the visibility or readability of the displayed image to the user is poor.
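
The computation in step 810 amounts to a small amount of trigonometry. A minimal sketch follows, assuming the display width in meters and in pixels is known from the environment model and interface definition; the function name is hypothetical.

    import numpy as np

    def angle_per_pixel(user_distance_m, display_width_m, display_width_px):
        # Viewing angle subtended by the display area at the user, divided
        # by the number of pixels across it (radians per pixel). Small
        # values mean poor visibility or readability for this user.
        subtended = 2.0 * np.arctan((display_width_m / 2.0) / user_distance_m)
        return subtended / display_width_px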

[0070] In step 820, the interface adapter module 270 verifies if all the display content in the interface definition is suitable for viewing by the user at the effective resolution determined in step 810. If the content is unsuitable, such as the font being too small to be readable at the effective resolution, or the thickness of a line or the size of an interaction widget being too small, then, in step 830, the interface adapter module 270 modifies the content and widgets to suit the effective resolution. This is typically done based on alternate interface choices or policies provided by an application. For example, the application may specify a set of alternate choices for the display content such as 1) “Sale”, 2) “Sale for $5.99 on X”, and 3) “Sale items: A $3.99, D $10.99, X $5.99, Y $8.99; Select your choice”. In this example, the interface adapter module 270 selects one of these options based on the effective resolution. Thus, when the user is far away, he only sees a large displayed image with the message “Sale”; as he comes closer to the displayed image, the second message appears, highlighting one of the sale items; and finally, when the user is close enough, a detailed message appears offering a list of sale items and the option of selecting one of the items for more information.
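
One possible selection rule is sketched below, under the assumption that each alternate definition records the pixel height of its smallest font: keep the most detailed alternate whose smallest font still subtends a legible viewing angle. The legibility constant (about one arc-minute) is a conventional figure, not a value from the patent.

    def choose_content(alternates, rad_per_pixel, min_legible_rad=2.9e-4):
        # `alternates` is ordered from least to most detailed, as
        # (smallest_font_px, definition) pairs, e.g.
        # [(40, "Sale"), (24, "Sale for $5.99 on X"), (12, "Sale items: ...")].
        best = alternates[0][1]
        for smallest_font_px, definition in alternates:
            if smallest_font_px * rad_per_pixel >= min_legible_rad:
                best = definition   # this level of detail is still readable
        return best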

[0071] In step 840, the interface adapter module 270 determines if the interaction areas specified in the current interface definition are reachable by the user. Initially, the interaction areas are placed at the locations specified by the default definition of the interface. Based on the user parameters 245, the interface adapter module 270 determines if each widget that the user has to interact with by hand is reachable by the user and whether it gets occluded during user interaction. For example, these widgets may be “play,” “pause,” “forward,” and “rewind” buttons in a video player application. In step 850, the interface adapter module 270 verifies if all interaction areas or widgets are reachable. If not, in step 860, the interface adapter modifies the widget definitions so that they are reachable by the user. For example, the buttons in a video player application may move to the right side of a displayed image if the user is on the right, and to the left side of the displayed image if the user is on the left, to ensure that these interaction buttons are reachable by the user.

[0072] In step 870, the interface adapter module 270 updates the interface definition based on the changes made in steps 830 and 860, outputs the updated display content definition 284 to the graphics controller 265, and outputs the updated interaction widget definition 286 to the interaction detector 275.

[0073] FIG. 9 shows an example of a three-dimensional model of an environment 910 with several defined display areas 915, 920, 925, 930, 935, 940, 945, 950, 955, 960, 965, 970, and 975. An environment model 230 can be determined from the environment 910, and such a model 230 will include descriptions of the display areas 915 through 975, along with descriptions of any potentially occluding permanent or movable structures.

[0074] FIG. 10 shows an example of a user trajectory estimated from images from one camera. Such an analysis is provided by step 690 of method 600 of FIG. 6. FIG. 10 indicates a user 1010, the detected head region 1020 in the current image, and the trajectory 1030 resulting from matches of head positions over successive images.

[0075] FIG. 11 illustrates the mapping of display and interaction areas from application space to real space, such as the mapping performed in steps 380 and 386, respectively, of FIG. 3. Display area 1110 is a display area definition in application space, H is the homography for the selected surface, and H pre-warps the display area 1110 to a distorted image 1115. This image, when projected by the projector, appears as a display area 1130 on the selected surface. The displayed image in the display area 1130 will be substantially undistorted when viewed by a user. FIG. 11 further illustrates the mapping of an interaction area from application space to real space. Interaction area 1120 is an interaction area definition in application space, H_(v) is the homography for the selected surface, and H_(v) maps the interaction area 1120 to the image 1125 seen by the camera. This image 1125 is what the camera sees corresponding to the interaction area 1140 on the selected surface in real space.

[0076] It is to be understood that the embodiments and variations shown and described herein are merely illustrative of the principles of this invention and that various modifications may be implemented by those skilled in the art without departing from the scope and spirit of the invention.

What is claimed is:
 1. A user-following interface system comprising: one or more projectors adapted to project one or more images onto one or more selected surfaces in an environment, wherein the one or more selected surfaces are selected from a plurality of surfaces; a user tracking module adapted to track positions of one or more users in the environment; and a controller coupled to the one or more projectors and the user tracking module, the controller adapted to select the one or more surfaces based on the positions of the one or more users in the environment, and to provide information to the one or more projectors suitable for allowing the one or more projectors to project the one or more images onto the one or more selected surfaces.
 2. The user-following interface system of claim 1, wherein the user tracking module is coupled to one or more tracking devices, wherein the user tracking module is further adapted to track orientations of the one or more users by using information from the one or more tracking devices, and wherein the controller is further adapted to select the one or more surfaces based on the orientations of the one or more users.
 3. The user-following interface system of claim 1, further comprising a model of the environment, wherein the controller is coupled to the model and is adapted to access, from the model, information about surfaces and objects in the environment.
 4. The user-following interface system of claim 1, wherein the controller is further adapted to identify occlusion of a projected image.
 5. The user-following interface system of claim 4, wherein the controller is further adapted to move the projected image to a surface that is not occluded.
 6. The user-following interface system of claim 1, wherein a selected surface comprises a display area and wherein the controller is further adapted to select a display area for an image by: mapping the position and orientation of a selected user into a three-dimensional environment model; determining a volume, in the three-dimensional model, that is viewable by the selected user; determining a set of displayable surfaces in the viewable volume; determining available display areas on each surface; determining a distance from a head of the selected user to a center of an available display area; eliminating available display areas that are occluded by the user to create remaining display areas; and selecting one of the remaining display areas, within the viewable volume, that is closest to the selected user while being approximately orthogonal to a viewing direction of the selected user.
 7. The user-following interface system of claim 1, wherein the controller is further adapted to determine warping parameters to pre-warp an image to ensure it appears substantially undistorted to a user when the image is displayed on a selected surface.
 8. The user-following interface system of claim 1, wherein the controller is further adapted to determine an orientation of a selected image when displayed on a selected surface, and to provide information to the one or more projectors so that the selected image, when displayed on a selected surface, looks upright and approximately orthogonal to a selected user.
 9. The user-following interface system of claim 1, wherein the controller is further adapted to determine a size of a projected image based on an available area on a display area and on a distance of the display area from a selected user.
 10. The user-following interface system of claim 1, wherein the user tracking module is coupled to one or more tracking devices, and wherein at least one of the tracking devices comprises one or more cameras that observe one or more of the users in the environment.
 11. The user-following interface system of claim 1, wherein the user tracking module is coupled to one or more tracking devices, and wherein at least one of the tracking devices comprises an electromagnetic device or an active badge worn by at least one user.
 12. The user-following interface system of claim 10, wherein the user tracking module is coupled to one or more tracking devices, and wherein the user tracking module is further adapted to analyze camera images from the one or more cameras for one or more of motion, shape, and color to determine the presence of people in each camera image.
 13. The user-following interface system of claim 1, wherein the user tracking module is coupled to one or more tracking devices, wherein the at least one tracking device is further adapted, when analyzing the camera images, to: find a difference between a current image from a selected camera and a previous image from the selected camera to create a difference image; threshold and filter the difference image to remove noise and to yield foreground regions; select foreground regions with sufficient human skin color; search the selected foreground regions for a shape of a human head; and determine a position of the head in the difference image.
 14. The user-following interface system of claim 10, wherein the user tracking module is further adapted to combine a detected location of an individual in multiple camera images and calibration parameters of the cameras to yield a three-dimensional location of the individual in the environment.
 15. The user-following interface system of claim 13, wherein the at least one tracking device is further adapted to analyze a detected head shape in a camera image to determine an orientation of the detected head shape, thereby determining which way a selected user is facing.
 16. The user-following interface system of claim 1, further comprising a sensing mechanism adapted to detect user interactions with a selected image projected onto one of the selected surfaces.
 17. The user-following interface system of claim 16, wherein the sensing mechanism comprises a camera adapted to observe the selected image.
 18. The user-following interface system of claim 16, wherein the sensing mechanism is further adapted to detect user hand gesture interactions with the selected image, and wherein the sensing mechanism is further adapted to detect a variety of interaction modes resulting from the user hand gesture interactions.
 19. The user-following interface system of claim 16, wherein the sensing mechanism is further adapted to detect user interaction when the user uses a laser pointing device.
 20. The user-following interface system of claim 16, wherein an interaction area, comprising a first image, is independent of a display area, comprising a second image, wherein the interaction area and display area are completely separate, completely overlapping, or partially overlapping.
 21. The user-following interface system of claim 16, wherein a configuration is defined by a user or application in application space, and wherein the controller is further adapted to automatically determine a view by a camera of the configuration for any surface that a selected image is projected on, so that no separate configurations for each camera view need to be defined.
 22. The user-following interface system of claim 16, wherein the controller is further adapted to select an interaction area for an image by: mapping position and orientation of the user into a three-dimensional environment model; determining a volume in the three-dimensional model that is reachable by the user; determining a set of surfaces available for interaction in the reachable volume; determining available interaction areas on each surface; determining a distance from an appendage of the user to a center of an available interaction area; eliminating interaction areas that are occluded by the user to create remaining interaction areas; and selecting one of the remaining interaction areas, within the reachable volume, that is closest to the user while being approximately orthogonal to a viewing direction of the user.
 23. The user-following interface system of claim 16, wherein the controller determines warping parameters that map an interaction area to a camera image of the interaction area.
 24. The user-following interface system of claim 1, further comprising an interface adapter adapted to change content of the one or more images based on user and surface parameters.
 25. The user-following interface system of claim 24, wherein the interface adapter is further adapted to determine an effective resolution of the one or more images with respect to a user.
 26. The user-following interface system of claim 24, wherein the interface adapter is further adapted to change content of the one or more images to be visible and readable by the user.
 27. The user-following interface system of claim 24, wherein the interface adapter is further adapted to modify one or more of size or position of interaction widgets in the one or more images based on user position and orientation parameters.
 28. The user-following interface system of claim 24, wherein the interface adapter is further adapted to modify a mode of interaction based on user position and orientation parameters.
 29. The user-following interface system of claim 24, wherein the interface adapter is further adapted to modify parameters of one or more interaction sensing mechanisms based on user position and orientation parameters.
 30. A method for creating and using a user-following interface, the method comprising the steps of: tracking positions of one or more users in an environment; selecting one or more surfaces from a plurality of surfaces based on the positions of the one or more users in the environment; and projecting one or more images onto the one or more selected surfaces, wherein the interface comprises the one or more images.
 31. An article of manufacture for creating and using a user-following interface, comprising: a computer-readable medium having computer-readable code means embodied thereon, the computer-readable program code means comprising: a step to track positions of one or more users in an environment; a step to select one or more surfaces from a plurality of surfaces based on the positions of the one or more users in the environment; and a step to project one or more images onto the one or more selected surfaces, wherein the interface comprises the one or more images.
 32. An apparatus for creating and using a user-following interface, comprising: at least one processor operable to: track positions of one or more users in an environment; select one or more surfaces from a plurality of surfaces based on the positions of the one or more users in the environment; and project one or more images onto the one or more selected surfaces, wherein the interface comprises the one or more images.